Networking And PC Setup Troubleshooting
Multiple computer setup cannot communicate via DDS
Problem: In a multiple computer setup, I can ping the other computer but can't communicate with it via DDS, for example with ros2 run demo_nodes_py talker on one computer and ros2 run demo_nodes_py listener on the other.
Solution: Check that your network interface adapter and peer address entries for your method of configuring DDS are correct as explained in the DDS configuration docs.
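Before editing the DDS configuration, it can help to confirm which settings are actually in effect on each machine. The following is a minimal sketch, not part of MoveIt Pro: the function name show_dds_env is hypothetical, and the variable list covers standard ROS 2 and common RMW settings, which may not be the ones your method of configuring DDS uses.

```shell
#!/bin/sh
# Hypothetical helper: print the DDS-related environment variables that are set.
# The names below are standard ROS 2 / RMW variables (an assumption about your
# setup), not MoveIt Pro specific settings.
show_dds_env() {
  env | grep -E '^(ROS_DOMAIN_ID|ROS_LOCALHOST_ONLY|RMW_IMPLEMENTATION|CYCLONEDDS_URI|FASTRTPS_DEFAULT_PROFILES_FILE)=' | sort
}

show_dds_env

# List interface names and addresses so they can be compared against the
# adapter and peer entries in the DDS configuration.
if command -v ip >/dev/null 2>&1; then
  ip -brief addr show 2>/dev/null || echo "could not list network interfaces"
fi
```

Run it on both computers, in the same shell or container that launches the ROS nodes, and compare the results against the interface and peer addresses you configured.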
ros2 topic list and other CLI verbs report incorrect results
Problem: In a multiple computer setup or single computer setup where I use ROS on the host filesystem, commands like ros2 topic list report incorrect results.
Solution: This is an artifact of how the ROS 2 CLI stays responsive: it caches DDS discovery information in a background "daemon" process.
The daemon is started automatically the first time a CLI command needs it, or manually with ros2 daemon start.
The system-wide daemon uses the DDS configuration present at the time of startup.
If you are using our default DDS configuration, which does not modify your host filesystem, this means the daemon starting on your host filesystem will use the ROS 2 DDS defaults, which will result in (a) topics from other computers that are multicasting on your network being listed and (b) topics from docker containers NOT being listed.
To fix this, the daemon must be started within the MoveIt Pro docker container in order for its DDS settings to be used.
# From `moveit_pro shell` or `moveit_pro dev`
ros2 daemon stop
ps aux | grep ros2cli\.daemon | grep -v grep # Verify the only result is the ros2cli.daemon process
ros2 daemon start
Importantly, this behavior is limited to the ROS 2 CLI verbs. With MoveIt Pro's default DDS settings, ROS processes in the container communicate only with nodes in the container or on your host filesystem, not with other computers on the network that are multicasting. Refer to this ros2cli issue for more details.
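One way to check whether a stale daemon is the cause is to compare the daemon's cached view with a fresh query that bypasses it. The sketch below assumes it is run from inside the MoveIt Pro container; topic_list_diff is a hypothetical helper, the /tmp file names are arbitrary, and --no-daemon is the standard ros2cli option for skipping the daemon.

```shell
#!/bin/sh
# Hypothetical helper: print topics that appear in only one of two
# newline-separated topic lists. "<" marks topics only in the first file,
# ">" marks topics only in the second. Prints nothing if the lists match.
topic_list_diff() {
  a=$(mktemp) && b=$(mktemp) || return 2
  sort -u "$1" > "$a"
  sort -u "$2" > "$b"
  diff "$a" "$b" | grep '^[<>]' || true
  rm -f "$a" "$b"
}

if command -v ros2 >/dev/null 2>&1; then
  ros2 topic list > /tmp/topics_daemon.txt
  ros2 topic list --no-daemon > /tmp/topics_no_daemon.txt
  topic_list_diff /tmp/topics_daemon.txt /tmp/topics_no_daemon.txt
fi
```

A persistent difference suggests the daemon was started with different DDS settings. A small transient difference can also appear because a query without the daemon has less time to discover topics.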
Multiple computer setup reports "Action server not available" errors
Problem: Multiple computers are connected over the network, but the agent cannot seem to access action servers or services provided by the robot drivers, or discover controllers, resulting in errors such as:
[ERROR] [1658269925.062171689] [MoveGripperAction]: Error code: -25; Message: Failed to send trajectory execution action goal to server.; Details: Action server not available.
...
[ERROR] [1658269931.800549146] [Teleoperate]: Error code: 99999; Message: Failed to enable required controller servo_controller; Details: Service /ensure_controller_is_active failed with message: Failed to ensure that specified controllers are active.
...
[ERROR] [1658269939.054476603] [moveit_ros.trajectory_execution_manager]: Controller '/joint_trajectory_controller' is not known
Solution: Check that your network interface adapter and peer address entries for your method of configuring DDS are correct as explained in the DDS configuration docs.
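To confirm that discovery across machines is the problem, you can check from the agent PC whether the names from the error messages are visible at all. This is a minimal sketch: report_missing is a hypothetical helper, and /ensure_controller_is_active is taken from the log output above, so substitute the services your setup actually requires.

```shell
#!/bin/sh
# Hypothetical helper: read a newline-separated list of names on stdin and
# report which of the expected names (given as arguments) are absent.
# Returns non-zero if at least one expected name is missing.
report_missing() {
  list=$(cat)
  status=0
  for name in "$@"; do
    if ! printf '%s\n' "$list" | grep -Fqx -- "$name"; then
      echo "missing: $name"
      status=1
    fi
  done
  return "$status"
}

if command -v ros2 >/dev/null 2>&1; then
  ros2 service list | report_missing /ensure_controller_is_active \
    || echo "the agent PC cannot see all expected services"
fi
```

The same helper works with the output of ros2 action list. If the names are visible on the drivers computer but missing on the agent PC, the DDS interface or peer entries are the likely cause.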
Multiple computer setup does not have synchronized clocks
Problem: In a multiple computer setup, the computers' clocks drift out of sync. This can be seen as messages arriving from the future, or as TF complaining about old messages.
Solution: To solve this, install chrony on one or both of the machines:
sudo apt-get install chrony
If any machine in the setup does not have access to NTP, the machine that has chrony installed can serve as an NTP server.
Suppose the drivers RTPC does not have an internet connection, but has a direct link to the agent PC.
On the main PC (usually the PC running the Agent), install chrony and then add lines to /etc/chrony/chrony.conf to allow the drivers RTPC to connect to it.
For example:
# Allow the drivers RTPC to use the agent PC as an NTP server.
# Replace 192.168.1.2 with the IP address of the drivers RTPC.
allow 192.168.1.2
Then restart the service:
sudo systemctl restart chrony.service
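After restarting, you may want to confirm that the directive was picked up as intended. The sketch below is a hedged example: chrony_allows is a hypothetical helper that only inspects the file (it does not query the running service), and 192.168.1.2 is the example address used above. It requires the directive to sit on a line of its own, because chrony treats only whole lines starting with a comment character as comments.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if the given chrony configuration file has an
# uncommented `allow` directive for exactly the given address or subnet, with
# nothing else on the line. Directive names are matched case-insensitively.
chrony_allows() {
  conf=$1
  addr=$2
  awk -v addr="$addr" '
    tolower($1) == "allow" && $2 == addr && NF == 2 { found = 1 }
    END { exit found ? 0 : 1 }
  ' "$conf"
}

if [ -r /etc/chrony/chrony.conf ]; then
  if chrony_allows /etc/chrony/chrony.conf 192.168.1.2; then
    echo "the drivers RTPC is allowed to use this machine as an NTP server"
  else
    echo "no matching allow directive found"
  fi
fi
```

Once the drivers RTPC has started syncing, running sudo chronyc clients on the agent PC should list its address.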
On the drivers RTPC, reconfigure timesyncd to use the agent PC as an NTP server.
Modify /etc/systemd/timesyncd.conf to have:
[Time]
# Use the agent PC as the primary NTP source.
# Replace 192.168.1.1 with the IP address of the agent PC.
NTP=192.168.1.1
And restart the service:
sudo systemctl restart systemd-timesyncd.service
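To verify that the drivers RTPC is now syncing against the agent PC, you can inspect the server that timesyncd reports. This is a sketch under assumptions: timesync_server is a hypothetical helper, it assumes the output of timedatectl timesync-status contains a "Server:" line with the address as the first value, and 192.168.1.1 is the example address used above.

```shell
#!/bin/sh
# Hypothetical helper: read `timedatectl timesync-status` output on stdin and
# print the first value after "Server:" (assumed to be the server address).
timesync_server() {
  awk '$1 == "Server:" { print $2; exit }'
}

if command -v timedatectl >/dev/null 2>&1; then
  server=$(timedatectl timesync-status 2>/dev/null | timesync_server)
  if [ "$server" = "192.168.1.1" ]; then
    echo "timesyncd is using the agent PC"
  else
    echo "timesyncd server is '${server:-unknown}', expected 192.168.1.1"
  fi
fi
```

It can take a short while after the restart before a server is reported, so an empty result immediately after restarting is not necessarily an error.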